Stork data scheduler: mitigating the data bottleneck in e-Science.

Authors

  • Tevfik Kosar
  • Mehmet Balman
  • Esma Yildirim
  • Sivakumar Kulasekaran
  • Brandon Ross
Abstract

In this paper, we present the Stork data scheduler as a solution for mitigating the data bottleneck in e-Science and data-intensive scientific discovery. Stork focuses on planning, scheduling, monitoring and management of data placement tasks and application-level end-to-end optimization of networked inputs/outputs for petascale distributed e-Science applications. Unlike existing approaches, Stork treats data resources and the tasks related to data access and movement as first-class entities just like computational resources and compute tasks, and not simply the side-effect of computation. Stork provides unique features such as aggregation of data transfer jobs considering their source and destination addresses, and an application-level throughput estimation and optimization service. We describe how these two features are implemented in Stork and their effects on end-to-end data transfer performance.
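The aggregation idea mentioned in the abstract, grouping data transfer jobs by their source and destination addresses, can be illustrated with a minimal sketch. This is not Stork's implementation: the function name `aggregate_transfers`, the job representation as `(source URL, destination URL)` pairs, and the example hosts are all assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def aggregate_transfers(jobs):
    """Group (src_url, dest_url) pairs by (source host, destination host).

    Hypothetical helper: jobs that share both endpoints land in the same
    batch, so a scheduler could hand each batch to a single connection.
    """
    batches = defaultdict(list)
    for src, dest in jobs:
        key = (urlsplit(src).hostname, urlsplit(dest).hostname)
        batches[key].append((src, dest))
    return dict(batches)


# Example hosts and paths are made up for the illustration.
jobs = [
    ("ftp://a.example.org/data/f1", "ftp://b.example.org/in/f1"),
    ("ftp://a.example.org/data/f2", "ftp://b.example.org/in/f2"),
    ("ftp://c.example.org/data/f3", "ftp://b.example.org/in/f3"),
]
batches = aggregate_transfers(jobs)
```

Here the first two jobs share both endpoints and fall into one batch, while the third forms its own. Grouping this way lets per-connection setup costs be paid once per host pair rather than once per file.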

Similar resources

Developing and Validating a New Wireless Wearable Device for Balance Measurement in Sport and Clinical Setting

One of the new clinical techniques to assess the lower body parameters is the wearable ultrasonic sensors. The device which can measure the static and dynamic balance abilities in sport and clinical setting by the traveled signals of ultrasonic transmitter and receiver between two feet was developed and validated. The new device consisted of a pressure gauge and a digital centimeter indicator a...

Selected papers from the 2010 e-Science All Hands Meeting.

The annual e-Science All Hands Meeting (AHM) is the premier e-Science conference held regularly in the United Kingdom, and provides a forum for the e-Science community to present and demonstrate their research, exchange ideas and socialize. This Theme Issue, entitled 'e-Science: novel research, new science, and enduring impact', features selected papers from AHM 2010 with the aim of highlightin...

Run-time Adaptation of Grid Data Placement Jobs

Grid presents a continuously changing environment. It also introduces a new set of failures. The data grid initiative has made it possible to run data-intensive applications on the grid. Data-intensive grid applications consist of two parts: a data placement part and a computation part. The data placement part is responsible for transferring the input data to the compute node and the result of ...

Data Replication-Based Scheduling in Cloud Computing Environment

High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

Improving for Drum_Buffer_Rope material flow management with attention to second bottlenecks and free goods in a job shop environment

Drum–Buffer–Rope is a theory of constraints production planning methodology that operates by developing a schedule for the system’s first bottleneck. The first bottleneck is the bottleneck with the highest utilization. In the theory of constraints, any job that is not processed at the first bottleneck is referred to as a free good. Free goods do not use capacity at the first bottleneck, so very...

Journal:
  • Philosophical transactions. Series A, Mathematical, physical, and engineering sciences

Volume 369, Issue 1949

Pages: -

Publication date: 2011